feat(memory): host-managed memory lifecycle — two-lane retrieval + after-turn record (mem0 flow) by BenKurrek · Pull Request #5326 · nearai/ironclaw

BenKurrek · 2026-06-26T12:26:46Z

Summary

Implements the mem0 host-managed memory flow on top of #5205 (reborn/memory-lift-followups), confined to the native memory provider + run-level orchestration. Memory now reaches the model proactively (the loop previously hardcoded memory_snippets: Vec::new()), and each completed turn is recorded back so the short-term lane is populated.

on_run_start:  long_term  = search(user scope)
               short_term = search(thread scope)
before_model:  prompt = system + long_term + short_term + conversation
after_turn:    record_interaction(messages=[user, assistant], run_id, metadata)

What changed

Retrieval (once per run)

Native retrieve_context scopes by invocation.scope.thread_id: Some(T) → only that thread's threads/<T>/ subtree (short-term); None → the user's general memory, excluding threads/* (long-term). Disjoint lanes → the host concatenates them (short-term first) under the existing 512 B/snippet + 4 KiB-aggregate admission.
ProductionMemoryPromptContextService::load_memory_snippets fetches both lanes once (long-term via ResourceScope::without_thread_and_mission()); ThreadBackedLoopContextPort caches the result in a per-run OnceCell and surfaces it into the "memory" prompt section on every model step.
Wired composition → loop_driver_host factory → port, mirroring user_profile_source. Graceful degradation everywhere — memory never breaks a turn.

Recording (after each turn) — "provider decides"

New low-level MemoryService::record_interaction(invocation, { messages, run_id, metadata }) — the mem0 add data shape (user_id/agent_id/thread_id ride the invocation scope). A default no-op trait impl lets each provider opt in: the host passes the DATA and the provider decides what to do with it (store verbatim, run LLM extraction, or nothing). Implements the reserved memory.interaction.record.v1 vocabulary.
The native provider stores the full turn history under threads/<thread_id>/log.md.
AfterTurnMemoryRecorder fires at turn_run_executor::apply_exit (gated on Completed), reads the exchange from the thread transcript with the owner-rewritten scope, and hands it down. Post-terminal, best-effort — every error is debug!-logged and never fails the already-completed run.

Testing

Unit + caller-level across ironclaw_memory{,_native}, host_runtime, loop_support, turns, reborn:

native two-lane scoping (short-term thread filter; long-term exclusion); host two-lane fetch + per-lane degradation + combined budget; prompt rendering from memory_snippets; once-per-run cache (load_loop_context fetch count stays 1); native record_interaction → retrieve_context surfaces it; and a full-turn record through the executor (turn_runner_worker_records_after_turn_memory_on_completed_run).
cargo clippy clean on the touched crates; ironclaw_reborn_cli builds. (3 sandbox_process tests fail locally only without Docker; 1 pre-existing unused import: Path warning lives in untouched reborn_cli.)

Known items / follow-ups

Production graph wires None — reads + writes resolve on the local-dev runtime path only; the production graph degrades to no memory (issue Reborn: wire production-graph composition for optional context sources (identity + profile) #5013, same as user_profile_source).
Full-composition e2e (record in one run → surface in a later run's model request) not yet written; the individual links are unit + caller tested.
Budget tuning: the combined block is short-term-first, capped at max_snippets — a scratch-heavy thread can starve the long-term lane (per-lane sub-budgets are a later refinement).
Phase 3 (on_run_end durable summary) intentionally deferred.

Design doc: docs/superpowers/specs/2026-06-25-reborn-memory-host-lifecycle-design.md

🤖 Generated with Claude Code

…ter-turn record (mem0 flow) Implements the mem0 host-managed memory flow on top of nearai#5205, with the surface area confined to the native memory provider and run-level orchestration. Retrieval (once per run): - The loop fetches long-term (user-general) and short-term (per-thread, `threads/<thread_id>/`) memory once at the first prompt build of a run and injects both into the prompt's "memory" section, replacing the dead `memory_snippets: Vec::new()` in loop_support. A per-run OnceCell caches the fetch; subsequent model steps reuse it. - Native `retrieve_context` scopes by `invocation.scope.thread_id`: Some(T) → only that thread's `threads/<T>/` subtree (short-term); None → the user's general memory, excluding `threads/*` (long-term). The lanes are disjoint, so the host concatenates them (short-term first) under the existing 4 KiB admission budget. - Graceful degradation throughout: a memory failure degrades the lane to empty and never breaks a turn. Recording (after each turn): - New low-level `MemoryService::record_interaction(invocation, { messages, run_id, metadata })` — the mem0 `add` data shape (`user_id`/`agent_id`/ `thread_id` ride the invocation scope). A default no-op trait impl lets each provider opt in: the host passes the DATA and the provider decides what to do with it (store verbatim, run LLM extraction, or nothing). Implements the reserved `memory.interaction.record.v1` vocabulary. - The native provider stores the full turn history under `threads/<thread_id>/`. - A host `AfterTurnMemoryRecorder` fires at the run-end seam (`turn_run_executor::apply_exit`, gated on `Completed`), reads the exchange from the thread transcript with the owner-rewritten scope, and hands it down. Post-terminal and best-effort: every error is `debug!`-logged and never fails the already-completed run. Reads and writes resolve on the local-dev runtime path; the production graph wires `None` (deferred, issue nearai#5013 — the same optionality as `user_profile_source`). Tested at unit + caller level across ironclaw_memory{,_native}, host_runtime, loop_support, turns, and reborn (two-lane fetch, prompt rendering, once-per-run cache, native record→retrieve, and a full-turn record through the executor). A full-composition e2e (record in one run → surface in a later run's model request) is called out as a follow-up. Design: docs/superpowers/specs/2026-06-25-reborn-memory-host-lifecycle-design.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-26T12:26:53Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (2)

staging
reborn-integration

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: fb5e3b42-9994-4965-a391-7141b7603fe4

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

gemini-code-assist

Code Review

This pull request implements a host-managed memory lifecycle (mem0 flow) for the Reborn planned loop, introducing proactive memory retrieval across short-term and long-term lanes, as well as after-turn interaction recording. The feedback is highly constructive and identifies several key areas for optimization and robustness: concurrently fetching memory lanes to reduce latency, preserving distributed tracing correlation IDs, wrapping background interaction recording in a timeout to prevent blocking worker threads, filtering out empty messages before storage, and ensuring failed memory fetches are cached as empty to avoid repeated timeout penalties on subsequent iterations.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-26T12:29:36Z

+        let cached = self
+            .memory_snippets_cache
+            .get_or_try_init(|| async {
+                let Some(request) = self.build_memory_prompt_context_request(context_messages)
+                else {
+                    return Ok(Vec::new());
+                };
+                service.load_memory_snippets(request).await
+            })
+            .await;
+        match cached {
+            Ok(snippets) => snippets.clone(),
+            // A retrieval failure must never break the turn: degrade to empty.
+            // The cell stays uninitialized, so a later iteration may retry.
+            Err(error) => {
+                tracing::debug!(
+                    kind = ?error.kind,
+                    "memory context fetch failed; degrading to empty memory for this run"
+                );
+                Vec::new()
+            }
+        }


Using get_or_try_init leaves the OnceCell uninitialized if the memory service fails. Consequently, every subsequent iteration (model step) in the same run will attempt to fetch memory again, repeatedly hitting timeouts/failures and severely degrading performance. Using get_or_init and caching an empty vector on failure ensures we only attempt to fetch memory exactly once per run, preventing repeated latency penalties.

let cached = self .memory_snippets_cache .get_or_init(|| async { let Some(request) = self.build_memory_prompt_context_request(context_messages) else { return Vec::new(); }; match service.load_memory_snippets(request).await { Ok(snippets) => snippets, Err(error) => { tracing::debug!( kind = ?error.kind, "memory context fetch failed; degrading to empty memory for this run" ); Vec::new() } } }) .await; cached.clone()

gemini-code-assist · 2026-06-26T12:29:36Z

+        let long_term_invocation = MemoryInvocation {
+            scope: short_term_invocation.scope.without_thread_and_mission(),
+            correlation_id: CorrelationId::new(),
+        };


Reusing the same correlation_id is important for distributed tracing and log correlation. Creating a new CorrelationId for the long-term lane breaks the correlation of the two lanes under the same request.

Suggested change

let long_term_invocation = MemoryInvocation {

scope: short_term_invocation.scope.without_thread_and_mission(),

correlation_id: CorrelationId::new(),

};

let long_term_invocation = MemoryInvocation {

scope: short_term_invocation.scope.without_thread_and_mission(),

correlation_id: short_term_invocation.correlation_id.clone(),

};

gemini-code-assist · 2026-06-26T12:29:36Z

+        let mut combined = self
+            .retrieve_lane(
+                short_term_invocation,
+                request.query.clone(),
+                request.max_snippets,
+                context_profile_id.clone(),
+                MemoryLane::ShortTerm,
            )
-            .await
-            .map_err(map_memory_service_error)?;
+            .await;
+        combined.extend(
+            self.retrieve_lane(
+                long_term_invocation,
+                request.query,
+                request.max_snippets,
+                context_profile_id,
+                MemoryLane::LongTerm,
+            )
+            .await,
+        );


The short-term and long-term lane retrievals are independent I/O-bound operations. Running them sequentially adds unnecessary latency to the prompt generation path. We can run them concurrently using tokio::join! to significantly reduce latency.

let (mut combined, long_term) = tokio::join!( self.retrieve_lane( short_term_invocation, request.query.clone(), request.max_snippets, context_profile_id.clone(), MemoryLane::ShortTerm, ), self.retrieve_lane( long_term_invocation, request.query, request.max_snippets, context_profile_id, MemoryLane::LongTerm, ) ); combined.extend(long_term);

gemini-code-assist · 2026-06-26T12:29:36Z

+    let user_content = history
+        .messages
+        .iter()
+        .find(|message| {
+            message.kind == MessageKind::User && message.turn_run_id.as_deref() == Some(run_id)
+        })
+        .and_then(|message| message.content.as_deref())?;
+    let assistant_content = history
+        .messages
+        .iter()
+        .find(|message| {
+            message.kind == MessageKind::Assistant
+                && message.status == MessageStatus::Finalized
+                && message.turn_run_id.as_deref() == Some(run_id)
+        })
+        .and_then(|message| message.content.as_deref())?;


If the user or assistant message content is empty or contains only whitespace, recording it wastes storage and pollutes the memory index. Filtering out empty/whitespace-only content ensures only meaningful interactions are recorded.

let user_content = history .messages .iter() .find(|message| { message.kind == MessageKind::User && message.turn_run_id.as_deref() == Some(run_id) }) .and_then(|message| message.content.as_deref()) .filter(|content| !content.trim().is_empty())?; let assistant_content = history .messages .iter() .find(|message| { message.kind == MessageKind::Assistant && message.status == MessageStatus::Finalized && message.turn_run_id.as_deref() == Some(run_id) }) .and_then(|message| message.content.as_deref()) .filter(|content| !content.trim().is_empty())?;

References

When canonicalizing a string, perform cheaper validation checks (like length, emptiness, and character set) on a trimmed slice before performing more expensive operations like replace that allocate a new string.

gemini-code-assist · 2026-06-26T12:29:36Z

+        if let Err(error) = self
+            .memory_writer
+            .record_interaction(invocation, request)
+            .await
+        {
+            debug!(error = %error, "after-turn memory: record_interaction failed; run already complete");
+        }


Since record_interaction is a best-effort background task, awaiting it directly in the worker's execution path without a timeout can block the worker indefinitely if the memory provider hangs or is extremely slow. Wrapping the call in a timeout ensures the worker remains responsive.

let record_result = tokio::time::timeout( std::time::Duration::from_secs(5), self.memory_writer.record_interaction(invocation, request), ) .await; match record_result { Ok(Err(error)) => { debug!(error = %error, "after-turn memory: record_interaction failed; run already complete"); } Err(_) => { debug!("after-turn memory: record_interaction timed out; run already complete"); } _ => {} }

BenKurrek · 2026-06-26T12:33:37Z

Superseded by #5327 — reopened as a same-repo PR on nearai (this one was opened from the BenKurrek fork by mistake). Same branch/commit.

github-actions Bot added size: XL 500+ changed lines risk: low Changes to docs, tests, or low-risk modules scope: docs Documentation scope: dependencies Dependency updates contributor: experienced 6-19 merged PRs labels Jun 26, 2026

gemini-code-assist Bot reviewed Jun 26, 2026

View reviewed changes

BenKurrek closed this Jun 26, 2026

BenKurrek deleted the reborn/memory-lifecycle branch June 26, 2026 12:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(memory): host-managed memory lifecycle — two-lane retrieval + after-turn record (mem0 flow)#5326

feat(memory): host-managed memory lifecycle — two-lane retrieval + after-turn record (mem0 flow)#5326
BenKurrek wants to merge 1 commit into
nearai:reborn/memory-lift-followupsfrom
BenKurrek:reborn/memory-lifecycle

BenKurrek commented Jun 26, 2026

Uh oh!

coderabbitai Bot commented Jun 26, 2026

Review skipped

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Uh oh!

BenKurrek commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

BenKurrek commented Jun 26, 2026

Summary

What changed

Testing

Known items / follow-ups

Uh oh!

coderabbitai Bot commented Jun 26, 2026

Review skipped

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

BenKurrek commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant